AITopics | additional training

additional training

ExPe: Exact Positional Encodings for Generative Transformer Models with Extrapolating Capabilities

Datseris, Aleksis, Vassileva, Sylvia, Koychev, Ivan, Boytcheva, Svetla

arXiv.org Artificial Intelligence, Oct-6-2025

This paper introduces a novel approach to position embeddings in transformer models, named "Exact Positional Embeddings" (ExPE): an absolute positional embedding method that can extrapolate to sequences longer than those it was trained on. Traditional transformer models rely on absolute or relative position embeddings to incorporate positional information into token embeddings, and these often struggle to extrapolate to sequences longer than those seen during training. Our proposed method utilizes a novel embedding strategy that encodes exact positional information by overriding specific dimensions of the embedding vectors, thereby enabling a more precise representation of token positions. The proposed approach not only maintains the integrity of the original embeddings but also enhances the model's ability to generalize to longer sequences. In causal language modeling, our ExPE embeddings significantly reduce perplexity compared to rotary and sinusoidal embeddings when tested on sequences longer than those used in training.
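The abstract does not say which dimensions are overridden or how the position is encoded, so the following is only a minimal sketch of the general idea. The function name `override_positions`, the choice of the last `reserved_dims` entries, and the linear encoding `pos * step * (k + 1)` are all assumptions made for illustration, not the authors' scheme.

```python
def override_positions(embeddings, reserved_dims=2, step=0.01):
    """Return copies of the token embeddings whose last `reserved_dims`
    entries are replaced by values that depend only on the token position."""
    out = []
    for pos, vec in enumerate(embeddings):
        new_vec = list(vec)  # copy so the caller's vectors stay untouched
        for k in range(reserved_dims):
            # Hypothetical encoding: linear in the position, with a
            # different scale for each reserved dimension.
            new_vec[len(vec) - reserved_dims + k] = pos * step * (k + 1)
        out.append(new_vec)
    return out


# Three toy 4-dimensional token embeddings (made-up values).
tokens = [[0.5, -0.2, 0.1, 0.9], [0.3, 0.3, -0.7, 0.4], [0.0, 0.8, 0.2, -0.1]]
encoded = override_positions(tokens)
```

The non-reserved dimensions pass through unchanged, which is one way to read the claim that the original embeddings keep their integrity. Because the encoding here is a closed-form function of the position, it is defined for any sequence length.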

information, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2509.19569

Country: Asia > Middle East (0.28)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)

Retrieve-Augmented Generation for Speeding up Diffusion Policy without Additional Training

Odonchimed, Sodtavilan, Matsushima, Tatsuya, Holk, Simon, Iwasawa, Yusuke, Matsuo, Yutaka

arXiv.org Artificial Intelligence, Jul-30-2025

Diffusion Policies (DPs) have attracted attention for their ability to achieve significant accuracy improvements in various imitation learning tasks. However, DPs depend on Diffusion Models, which require multiple noise removal steps to generate a single action, resulting in long generation times. To solve this problem, knowledge distillation-based methods such as Consistency Policy (CP) have been proposed. However, these methods require a significant amount of training time, especially for difficult tasks. In this study, we propose RAGDP (Retrieve-Augmented Generation for Diffusion Policies), a novel framework that uses a knowledge base to expedite the inference of pre-trained DPs without additional training. Concretely, RAGDP encodes observation-action pairs through the DP encoder to construct a vector database of expert demonstrations. During inference, the current observation is embedded, and the most similar expert action is extracted. This extracted action is combined with an intermediate noise removal step to reduce the number of steps required compared to the original diffusion process. We show that by using RAGDP with the base model and existing acceleration methods, we improve the accuracy and speed trade-off with no additional training. Even when accelerating the models 20 times, RAGDP maintains an advantage in accuracy, with a 7% increase over distillation models such as CP.
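The retrieve-then-denoise flow described in this abstract can be sketched roughly as follows. Everything here is a toy stand-in: the two-entry knowledge base, the cosine-similarity lookup, the Gaussian perturbation with `noise_scale`, and `toy_denoise_step` are assumptions for illustration, and the paper's actual encoder, noise schedule and combination rule are not given in the abstract.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def retrieve_action(query_emb, knowledge_base):
    """Return the expert action paired with the most similar observation embedding."""
    _, action = max(knowledge_base, key=lambda pair: cosine(query_emb, pair[0]))
    return list(action)


def accelerated_inference(query_emb, knowledge_base, denoise_step,
                          remaining_steps, noise_scale, rng):
    """Start denoising from a noised expert action instead of pure noise."""
    action = [a + noise_scale * rng.gauss(0.0, 1.0)
              for a in retrieve_action(query_emb, knowledge_base)]
    # Only the last `remaining_steps` noise-removal steps are executed.
    for step in reversed(range(remaining_steps)):
        action = denoise_step(action, step)
    return action


# Hypothetical data: (observation embedding, expert action) pairs.
knowledge_base = [([1.0, 0.0], [0.9, 0.1]), ([0.0, 1.0], [-0.8, 0.2])]
query_emb = [0.9, 0.1]

calls = []


def toy_denoise_step(action, step):
    """Placeholder for one noise-removal step of a pre-trained DP."""
    calls.append(step)
    return [0.9 * a for a in action]


nearest = retrieve_action(query_emb, knowledge_base)
action = accelerated_inference(query_emb, knowledge_base, toy_denoise_step,
                               remaining_steps=3, noise_scale=0.05,
                               rng=random.Random(0))
```

In this sketch the speed-up comes entirely from `remaining_steps` being smaller than the full number of diffusion steps; no model weights are updated at any point.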

accuracy, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2507.21452

Country: Asia > Japan (0.28)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

IMPARA-GED: Grammatical Error Detection is Boosting Reference-free Grammatical Error Quality Estimator

Sakai, Yusuke, Goto, Takumi, Watanabe, Taro

arXiv.org Artificial Intelligence, Jun-4-2025

We propose IMPARA-GED, a novel reference-free automatic grammatical error correction (GEC) evaluation method with grammatical error detection (GED) capabilities. We focus on the quality estimator of IMPARA, an existing automatic GEC evaluation method, and construct that of IMPARA-GED using a pre-trained language model with enhanced GED capabilities. Experimental results on SEEDA, a meta-evaluation dataset for automatic GEC evaluation methods, demonstrate that IMPARA-GED achieves the highest correlation with human sentence-level evaluations.

artificial intelligence, computational linguistic, natural language, (14 more...)

arXiv.org Artificial Intelligence

2506.02899

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)
Asia > Japan > Honshū (0.28)

Genre: Research Report > New Finding (0.69)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Learning with Expert Abstractions for Efficient Multi-Task Continuous Control

Jewett, Jeff, Saisubramanian, Sandhya

arXiv.org Artificial Intelligence, Mar-18-2025

Decision-making in complex, continuous multi-task environments is often hindered by the difficulty of obtaining accurate models for planning and the inefficiency of learning purely from trial and error. While precise environment dynamics may be hard to specify, human experts can often provide high-fidelity abstractions that capture the essential high-level structure of a task and user preferences in the target environment. Existing hierarchical approaches often target discrete settings and do not generalize across tasks. We propose a hierarchical reinforcement learning approach that addresses these limitations by dynamically planning over the expert-specified abstraction to generate subgoals to learn a goal-conditioned policy. To overcome the challenges of learning under sparse rewards, we shape the reward based on the optimal state value in the abstract model. This structured decision-making process enhances sample efficiency and facilitates zero-shot generalization. Our empirical evaluation on a suite of procedurally generated continuous control environments demonstrates that our approach outperforms existing hierarchical reinforcement learning methods in terms of sample efficiency, task completion rate, scalability to complex tasks, and generalization to novel scenarios.
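The abstract says only that the reward is shaped "based on the optimal state value in the abstract model". One standard way to do that is potential-based shaping, sketched below; treating the abstract value as the potential, and the names `shaped_reward`, `abstract_value` and `room_of`, are assumptions for illustration rather than the paper's formulation.

```python
def shaped_reward(reward, state, next_state, abstraction, abstract_value, gamma=0.99):
    """Potential-based shaping: r + gamma * V(phi(s')) - V(phi(s)),
    where phi maps a concrete state to its abstract state."""
    potential = abstract_value[abstraction(state)]
    next_potential = abstract_value[abstraction(next_state)]
    return reward + gamma * next_potential - potential


def room_of(x):
    """Hypothetical abstraction: a 1-D position falls into a room every 10 units."""
    return int(x // 10)


# Hypothetical optimal values of the three abstract states (rooms).
abstract_value = {0: 0.0, 1: 0.5, 2: 1.0}

# Crossing from room 0 into room 1 earns a bonus even though the sparse
# environment reward is still zero.
bonus = shaped_reward(0.0, 9.0, 11.0, room_of, abstract_value)
# Moving within a room of value zero changes nothing.
same_room = shaped_reward(0.0, 3.0, 4.0, room_of, abstract_value)
```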

abstraction, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2503.14809

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Oregon > Benton County > Corvallis (0.04)
Europe > Bulgaria > Varna Province > Varna (0.04)
Asia > China (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Retrieval Backward Attention without Additional Training: Enhance Embeddings of Large Language Models via Repetition

Duan, Yifei, Shang, Raphael, Liang, Deng, Cai, Yongqiang

arXiv.org Artificial Intelligence, Feb-28-2025

Language models can be viewed as functions that embed text into Euclidean space, where the quality of the embedding vectors directly determines model performance. Training such neural networks, however, involves various uncertainties. This paper focuses on improving the performance of pre-trained language models in zero-shot settings through a simple and easily implementable method. We propose a novel backward attention mechanism to enhance contextual information encoding. Evaluated on the Chinese Massive Text Embedding Benchmark (C-MTEB), our approach achieves significant improvements across multiple tasks, providing valuable insights for advancing zero-shot learning capabilities.

dataset, language model, retrieval backward attention, (9 more...)

arXiv.org Artificial Intelligence

2502.20726

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Beijing > Beijing (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)

Vision-Encoders (Already) Know What They See: Mitigating Object Hallucination via Simple Fine-Grained CLIPScore

Oh, Hongseok, Hwang, Wonseok

arXiv.org Artificial Intelligence, Feb-27-2025

Recently, Large Vision-Language Models (LVLMs) have shown remarkable performance across various domains. However, these models suffer from object hallucination. This study revisits the previous claim that the primary cause of such hallucination lies in the limited representational capacity of the vision encoder. Our analysis reveals that the capacity of the vision encoder itself is already sufficient for detecting object hallucination. Based on this insight, we propose Fine-grained CLIPScore (F-CLIPScore), a simple yet effective evaluation metric that enhances object-level granularity by incorporating text embeddings at the noun phrase level. Evaluations on the OHD-Caps benchmark show that F-CLIPScore outperforms conventional CLIPScore in accuracy by a large margin of 39.6% without additional training. We further validate F-CLIPScore by showing that an LVLM trained with data filtered using F-CLIPScore exhibits reduced hallucination.
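As a rough illustration of scoring at the noun phrase level, the sketch below averages image-text similarity over the whole caption and each of its noun phrases. The averaging rule, the use of plain cosine similarity without any rescaling, and the 2-D toy embeddings are assumptions; in practice the vectors would come from a CLIP-style image and text encoder, and the abstract does not spell out the exact formula.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def fine_grained_score(image_emb, caption_emb, noun_phrase_embs):
    """Mean similarity between the image and (caption + each noun phrase)."""
    scores = [cosine(image_emb, caption_emb)]
    scores += [cosine(image_emb, phrase_emb) for phrase_emb in noun_phrase_embs]
    return sum(scores) / len(scores)


# Toy embeddings (made-up values). The caption as a whole matches the image,
# but one of its two noun phrases names an object the image does not contain.
image_emb = [1.0, 0.0]
caption_emb = [1.0, 0.0]
noun_phrase_embs = [[1.0, 0.0], [0.0, 1.0]]

caption_only = cosine(image_emb, caption_emb)
fine_grained = fine_grained_score(image_emb, caption_emb, noun_phrase_embs)
```

In this toy case the caption-level score stays at its maximum while the fine-grained score drops, which is the kind of object-level signal the abstract describes.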

f-clipscore, hallucination, vision encoder, (13 more...)

arXiv.org Artificial Intelligence

2502.20034

Country:

Asia > Thailand > Bangkok > Bangkok (0.05)
Asia > Singapore (0.05)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
(2 more...)

Genre: Research Report > New Finding (0.94)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)

Importance Sampling via Score-based Generative Models

Kim, Heasung, Lee, Taekyun, Kim, Hyeji, de Veciana, Gustavo

arXiv.org Artificial Intelligence, Feb-6-2025

Importance sampling, which involves sampling from a probability density function (PDF) proportional to the product of an importance weight function and a base PDF, is a powerful technique with applications in variance reduction, biased or customized sampling, data augmentation, and beyond. Inspired by the growing availability of score-based generative models (SGMs), we propose an entirely training-free importance sampling framework that relies solely on an SGM for the base PDF. Our key innovation is realizing the importance sampling process as a backward diffusion process, expressed in terms of the score function of the base PDF and the specified importance weight function (both readily available), eliminating the need for any additional training. We conduct a thorough analysis demonstrating the method's scalability and effectiveness across diverse datasets and tasks, including importance sampling for industrial and natural images with neural importance weight functions. The training-free aspect of our method is particularly compelling in real-world scenarios where a single base distribution underlies multiple biased sampling tasks, each requiring a different importance weight function. To the best of our knowledge, our approach is the first importance sampling framework to achieve this.
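A short worked identity shows why the score of the base PDF and the weight function are the natural ingredients. The symbols p (base PDF), w (importance weight function) and q (target PDF) are notation assumed here, and the identity covers only the noise-free density; how the paper extends it to every noise level of the backward diffusion is not stated in the abstract.

```latex
\[
q(x) = \frac{w(x)\,p(x)}{\int w(u)\,p(u)\,\mathrm{d}u}
\quad\Longrightarrow\quad
\nabla_x \log q(x) = \nabla_x \log w(x) + \nabla_x \log p(x).
\]
```

The normalising integral does not depend on x, so it vanishes under the gradient. The target score is therefore the base score plus the gradient of the log-weight, and neither term requires training a new model.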

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2502.04646

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > New York (0.04)
North America > United States > Colorado > Denver County > Denver (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.71)
Information Technology > Artificial Intelligence > Vision (0.69)

Impact of Extended Reality on Robot-Assisted Surgery Training

Bickford, Michael, Alruwaili, Fayez, Ragab, Sara, Rothenberg, Hanna, Abedin-Nasab, Mohammad

arXiv.org Artificial Intelligence, Jan-22-2025

Robot-Assisted Surgeries (RAS) have one of the steepest learning curves of any type of surgery. Because of this, methods to practice RAS outside the operating room have been developed to improve surgeons' skills. These strategies include the incorporation of extended reality simulators into surgical training programs. In this systematic review, we seek to determine whether extended reality simulators can improve the performance of novice surgeons and how their performance compares to that of surgeons trained conventionally on surgical robots. Using the PRISMA 2020 guidelines, a systematic review and meta-analysis was performed, searching PubMed, Embase, Web of Science, and the Cochrane Library for studies that compared the performance of novice surgeons who received no additional training, trained with extended reality, or trained with inanimate physical simulators (conventional additional training). We included articles that gauged performance using either GEARS or time-to-complete measurements and used SPSS to perform a meta-analysis comparing the performance outcomes of the surgeons after training. Surgeons trained using extended reality completed their surgical tasks statistically significantly faster than those who did not receive training (Cohen's d=-0.95, p=0.02), and moderately slower than those conventionally trained (Cohen's d=0.65, p=0.14). However, this latter difference was not statistically significant. Surgeons trained on extended reality demonstrated a statistically significant improvement in GEARS scores over those who did not train (Cohen's d=0.964, p<0.001), while their GEARS scores were comparable to those of surgeons trained conventionally (Cohen's d=0.65, p=0.14). This meta-analysis demonstrates that extended reality simulators translated complex skills to surgeons in a low-cost and low-risk environment.
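For readers unfamiliar with the effect size reported above, Cohen's d is the difference between two group means divided by their pooled standard deviation. The sketch below computes it for two made-up sets of completion times; the data and the sign convention (first group minus second) are assumptions for illustration, and the review itself used SPSS rather than this code.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardised mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical task completion times in seconds (not from the review).
xr_trained = [100.0, 110.0, 120.0]
untrained = [120.0, 130.0, 140.0]

d = cohens_d(xr_trained, untrained)
```

With this convention a negative d on a time measure means the first group was faster, which matches the direction of the d=-0.95 result quoted in the abstract.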

participant, simulator, surgeon, (12 more...)

arXiv.org Artificial Intelligence

2503.15503

Country: North America > United States > New Jersey > Gloucester County > Glassboro (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.69)

Industry:

Health & Medicine > Surgery (1.00)
Education > Curriculum > Subject-Specific Education (0.34)

Technology:

Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)

ULTra: Unveiling Latent Token Interpretability in Transformer Based Understanding

Hosseini, Hesam, Mighan, Ghazal Hosseini, Afzali, Amirabbas, Amini, Sajjad, Houmansadr, Amir

arXiv.org Artificial Intelligence, Nov-15-2024

Transformers have revolutionized Computer Vision (CV) and Natural Language Processing (NLP) through self-attention mechanisms. However, due to their complexity, their latent token representations are often difficult to interpret. We introduce a novel framework that interprets Transformer embeddings, uncovering meaningful semantic patterns within them. Based on this framework, we demonstrate that zero-shot unsupervised semantic segmentation can be performed effectively without any fine-tuning using a model pre-trained for tasks other than segmentation. Our method reveals the inherent capacity of Transformer models for understanding input semantics and achieves state-of-the-art performance in semantic segmentation, outperforming traditional segmentation models. Specifically, our approach achieves an accuracy of 67.2% and an mIoU of 32.9% on the COCO-Stuff dataset, as well as an mIoU of 51.9% on the PASCAL VOC dataset. Additionally, we validate our interpretability framework on LLMs for text summarization, demonstrating its broad applicability and robustness.

segmentation, semantic segmentation, transformer, (15 more...)

arXiv.org Artificial Intelligence

2411.12589

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Germany > Brandenburg > Potsdam (0.05)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Pay Attention to What Matters

Silva, Pedro Luiz, de Domenico, Antonio, Maatouk, Ali, Ayed, Fadhel

arXiv.org Artificial Intelligence, Sep-19-2024

Despite the remarkable success of Large Language Models (LLMs), they still exhibit a limited capability to align their outputs to the user instructions. In this work, we introduce a simple and effective method, which we name GUIDE, that mechanistically increases attention scores in instruction tokens. To support this operation, we present Influence, a novel metric that highlights how the user's instructions propagate through the transformer layers and impact the LLM output. Our results show that GUIDE improves the accuracy of following instructions 29.4% to 60.4%, outperforming natural prompting alternatives and Supervised Fine-Tuning up to 1M tokens.
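One simple way to raise attention on instruction tokens is to add a constant to their pre-softmax scores, sketched below for a single attention row. The additive bias `delta`, the boolean `instruction_mask`, and the function names are assumptions for illustration; the abstract does not give GUIDE's exact formulation or where in the model it is applied.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def biased_attention(logits, instruction_mask, delta):
    """Add `delta` to the scores of instruction tokens, then normalise."""
    shifted = [x + delta if is_instruction else x
               for x, is_instruction in zip(logits, instruction_mask)]
    return softmax(shifted)


# One query's raw attention scores over four tokens (made-up values);
# the first two tokens belong to the user's instruction.
logits = [1.0, 0.5, 0.2, 0.1]
instruction_mask = [True, True, False, False]

plain = softmax(logits)
guided = biased_attention(logits, instruction_mask, delta=1.0)

plain_mass = plain[0] + plain[1]
guided_mass = guided[0] + guided[1]
```

The weights still sum to one, so the extra attention on instruction tokens is taken from the remaining tokens. No parameters are changed, which is consistent with the method being applied at inference time.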

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2409.19001

Country:

North America > United States > Connecticut > New Haven County > New Haven (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
